CNS API contracts for NUMA-Aware Pods #3825
Pull Request Overview
This PR introduces two new REST API endpoints for managing InfiniBand (IB) devices in Azure Container Network Service (CNS), enabling NUMA-aware pod device assignment.
- Adds PUT endpoint to assign specific IB devices to pods via MAC addresses
- Adds GET endpoint to query the status and assignment of individual IB devices
- Defines error codes and device status enums for the IB device management system
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cns/types/infiniband/status.go | Defines device status constants for IB device lifecycle states |
| cns/types/infiniband/errorcodes.go | Defines error codes for IB device operations |
| cns/swagger.yaml | Adds OpenAPI specification for the two new IB device endpoints |
| cns/api.go | Defines Go structs for API request/response contracts |
Comments suppressed due to low confidence (1)
cns/swagger.yaml:186
- The schema name 'GetIBDeviceStatusResponse' is inconsistent with the endpoint naming convention. Based on the API path '/ibdevices/{macaddress}', it should be 'GetIBDeviceInfoResponse' to match the PR description and maintain consistency.
$ref: "#/components/schemas/GetIBDeviceStatusResponse"
Waiting for #3876 to merge first before merging this |
Alright @rbtr , I've made some changes, I just have 2 remaining questions:
|
Hi @rbtr , lmk what you think now with the latest changes, thanks |
Exactly, if you don't have an action in the URI, the implicit contract is that POSTing to
There may be a race here, since the namespaced name is not unique. A pod could get a stale MTPNC? |
Yeah that's a good point, the ibdevices are provisioned at the Azure host level
Yes, you're right, we could hit a race condition here. Let me see what we do today (since this already happens) |
Alright @rbtr , I ran some tests, and something that changes even when pods get the same names, is We could make this a required parameter for the POST API
Wdyt? |
who creates the MTPNC? why don't we set an ownerref to the Pod so it gets GCed? |
Today, DNC-RC creates the MTPNC, however, in this new NUMA scenario, CNS will create the MTPNC instead
For this new NUMA scenario, we could do that
I don't know. Basically @rbtr, is there anything else you want changed in this PR, besides adding an ownerref to the pod so it gets GCed? (That will come in a future PR anyway, when I actually implement the APIs; this PR is to establish the contract between the partner team and us.) |
Hey wait, we may actually set an ownerref already (I probably just missed it). Here is one example MTPNC that DNC-RC already creates today:
apiVersion: multitenancy.acn.azure.com/v1alpha1
kind: MultitenantPodNetworkConfig
metadata:
  creationTimestamp: "2025-05-29T17:51:35Z"
  finalizers:
  - finalizers.acn.azure.com/dnc-operations
  generation: 1
  name: ibpod1-deployment-5dbf8f965-sdgzr
  namespace: e2e-ns-d0s9t9ibfg06a1lnpsog
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Pod
    name: ibpod1-deployment-5dbf8f965-sdgzr
    uid: 18a8b6b2-b247-4ae6-a351-5d56a203eab4
  resourceVersion: "14083"
  uid: 817a229e-24cd-4efc-b37c-75d4ff48cb5f
Looking more in the code to see where we do that |
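For illustration, a minimal sketch of how a stale MTPNC could be told apart from a current one, using the owner reference in the MTPNC above. It assumes the caller knows the UID of the pod it is serving; `OwnerReference` here is a local stand-in for the Kubernetes type, and `isOwnedByPod` is a hypothetical helper, not something taken from this PR.

```go
package main

import "fmt"

// OwnerReference is a local stand-in for the Kubernetes owner reference type,
// reduced to the fields that appear in the MTPNC example above.
type OwnerReference struct {
	Kind string
	Name string
	UID  string
}

// isOwnedByPod (hypothetical helper) reports whether any owner reference
// points at the Pod with the given UID. Matching on UID rather than name
// distinguishes a recreated pod that reuses the same namespaced name.
func isOwnedByPod(refs []OwnerReference, podUID string) bool {
	for _, ref := range refs {
		if ref.Kind == "Pod" && ref.UID == podUID {
			return true
		}
	}
	return false
}

func main() {
	refs := []OwnerReference{{
		Kind: "Pod",
		Name: "ibpod1-deployment-5dbf8f965-sdgzr",
		UID:  "18a8b6b2-b247-4ae6-a351-5d56a203eab4",
	}}
	// Same UID: the MTPNC belongs to this pod instance.
	fmt.Println(isOwnedByPod(refs, "18a8b6b2-b247-4ae6-a351-5d56a203eab4"))
	// Different UID (made up here): same name, but a different pod instance.
	fmt.Println(isOwnedByPod(refs, "00000000-0000-0000-0000-000000000000"))
}
```

Whether a UID check like this is the right guard for the race discussed above is an open design question in this thread; the sketch only shows the comparison itself.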
Yeah I found where it is added, I sent it to you privately |
So, we know there's an ownerref on the MTPNC pointing to the Pod. Are you saying we should also add an ownerref on the Pod pointing to the MTPNC? |
This pull request is stale because it has been open for 2 weeks with no activity. Remove stale label or comment or this will be closed in 7 days |
Hi @rbtr, Kshitija and I were talking and on the POST API, if the pod has 7 or 8 MACs, the URL will get really long. 😅 Can I move them into a request body instead? 😅
{
  "MACs": ["e6:ed:09:b1:97:7e",
           "ea:2a:3f:19:fd:49",
           "e9:b1:8b:f9:d1:e0",
           "00:7b:34:84:c5:0f",
           "5f:26:ba:f7:2b:41",
           "07:81:05:db:2a:36",
           "a8:e9:a1:50:3b:2a"],
  "podnamespace": "mynamespace",
  "podname": "mypod"
}
|
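As a rough sketch of what decoding and validating a body like this could look like on the CNS side: the JSON field names are taken from the example above, while `assignRequest` and `parseAssignRequest` are hypothetical names (the real structs live in api.go and may differ), and the validation rules are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// assignRequest mirrors the JSON body proposed above. The type name is
// hypothetical; only the JSON keys come from the example.
type assignRequest struct {
	MACs         []string `json:"MACs"`
	PodNamespace string   `json:"podnamespace"`
	PodName      string   `json:"podname"`
}

// parseAssignRequest decodes the body and applies basic checks (assumed
// rules): pod name and namespace are set, and every MAC parses.
func parseAssignRequest(body []byte) (assignRequest, error) {
	var req assignRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, fmt.Errorf("decode body: %w", err)
	}
	if req.PodName == "" || req.PodNamespace == "" {
		return req, fmt.Errorf("podname and podnamespace are required")
	}
	if len(req.MACs) == 0 {
		return req, fmt.Errorf("at least one MAC is required")
	}
	for _, m := range req.MACs {
		if _, err := net.ParseMAC(m); err != nil {
			return req, fmt.Errorf("invalid MAC %q: %w", m, err)
		}
	}
	return req, nil
}

func main() {
	body := []byte(`{
		"MACs": ["e6:ed:09:b1:97:7e", "ea:2a:3f:19:fd:49"],
		"podnamespace": "mynamespace",
		"podname": "mypod"
	}`)
	req, err := parseAssignRequest(body)
	fmt.Println(req.PodNamespace, req.PodName, len(req.MACs), err)
}
```

A validation failure here would presumably map to the HTTP 400 error response in the contract, but that mapping is not spelled out in this thread.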
/azp run Azure Container Networking PR |
Azure Pipelines successfully started running 1 pipeline(s). |
CNS IBDevice API Contracts
Overview
Two new APIs for managing InfiniBand (IB) devices in Azure Container Network Service (CNS):
API 1: POST IB Devices for Pod
Endpoint
Request Body
Success Response
Error Response
HTTP 400
API 2: GET IB Device Information
Endpoint
Request
No request body (the MAC address is provided in the URL path, /ibdevices/{macaddress})
Success Response
Device Not Found Response
HTTP 404
See status.go in this PR for the list of all statuses of an IB device.
Request body
See api.go for the structs:
- AssignIBDevicesToPodRequest
- AssignIBDevicesToPodResponse
- GetIBDeviceInfoResponse